Google CTF 2022: APPNOTE.TXT (misc)

Problem statement

Every single archive manager unpacks this to a different file...

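Overview

We are given a ZIP file, but extracting it yields only a single text file.

Looking at the ZIP file in a binary editor reveals records related to the flag, which shows that those records are never extracted.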
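The same observation can be made without a binary editor by counting record signatures in the raw bytes. The following is a minimal sketch: `count_signatures` is a hypothetical helper name, and the demo runs on a small archive built in memory with Python's `zipfile` module, not on the challenge file.

```python
import io
import zipfile

# Record signatures defined in APPNOTE.TXT.
SIGNATURES = {
    "local file header": b"PK\x03\x04",
    "central directory header": b"PK\x01\x02",
    "end of central directory": b"PK\x05\x06",
}


def count_signatures(data: bytes) -> dict:
    """Count raw occurrences of each signature (hypothetical helper)."""
    return {name: data.count(sig) for name, sig in SIGNATURES.items()}


# Demo archive built in memory; this is NOT the challenge file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.txt", "hello")
    zf.writestr("flag00", "a")

counts = count_signatures(buf.getvalue())
print(counts)
```

In an ordinary archive the number of local file headers matches the number of files an extractor produces. A large mismatch, as in this challenge, hints that records are hidden from the extractor.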

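Solution

The challenge name looked familiar, and searching for APPNOTE.TXT turned up APPNOTE.TXT, the ZIP specification (probably because I had looked up the ZIP file format for a different challenge before). That suggested the flag was hidden by exploiting the ZIP specification.

As a first step I wrote a rough script to scan the records and ran it. (At this point I made the mistake of analyzing the distributed ZIP as-is, which complicated the problem and wasted time.)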

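The scan revealed the following.

The file comment length and ZIP file comment length are so large that the comments swallow records beginning with PK.

The compression method of the records is 0, meaning the data is stored uncompressed.

There are many records for files named flag00, flag01, ..., flag18.

The content of every file whose name starts with flag is a single character chosen from abcdefghijklmnopqrstuvwxyz{CTF0137}_, and a flag file exists for every one of those characters.

Concretely, there is a file named flag00 containing a, a file named flag00 containing b, ..., a file named flag00 containing _, and the same pattern continues up to flag18, so there are 19*36=684 records starting with flag in total.

There are 21 End of central directory records (EOCDR).

ZIP extractors generally seem to locate the EOCDR and extract from there. The EOCDR sits at the end of the ZIP file.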
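As a rough illustration of that lookup, the sketch below searches backward for the last EOCDR signature and unpacks its fixed 22-byte part. It is only one simple strategy, and real extractors differ in how they search and validate. `find_last_eocd` and `parse_eocd` are hypothetical helper names, and the demo archive is built in memory rather than taken from the challenge.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"


def find_last_eocd(data: bytes) -> int:
    """Offset of the last EOCDR signature, or -1 (hypothetical helper)."""
    return data.rfind(EOCD_SIG)


def parse_eocd(data: bytes, pos: int) -> dict:
    """Unpack the fixed 22-byte part of an EOCDR located at pos."""
    (_sig, disk_no, cd_disk, entries_here, entries_total,
     cd_size, cd_offset, comment_len) = struct.unpack_from("<4s4H2IH", data, pos)
    return {
        "entries_total": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
    }


# Demo archive built in memory; this is NOT the challenge file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.txt", "hello")
    zf.comment = b"note"
data = buf.getvalue()

pos = find_last_eocd(data)
eocd = parse_eocd(data, pos)
print(pos, eocd)
```

In this well-formed archive the central directory ends exactly where the EOCDR begins, and the comment length matches the bytes that follow the record. The challenge file breaks those expectations by inflating the comment lengths.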

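From these facts it is reasonable to guess that the flag is 19 characters long and can be recovered from the EOCDRs. This ZIP file appears to have been tampered with, for example by inflating the comment lengths, so that the flag is never extracted. Using each EOCDR's "offset of start of central directory with respect to the starting disk number" to pick out flag characters produced the flag.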
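The extraction step on its own can be sketched on a toy byte string. This is an illustration under assumptions, not the challenge data: the layout is not a valid ZIP, `make_eocd` and `chars_before_cd_offsets` are hypothetical helper names, and the only meaningful field in each fake EOCDR is the central directory offset.

```python
import struct

EOCD_SIG = b"PK\x05\x06"


def make_eocd(cd_offset: int) -> bytes:
    """Build a bare 22-byte EOCDR; only cd_offset carries information."""
    return struct.pack("<4s4H2IH", EOCD_SIG, 0, 0, 1, 1, 0, cd_offset, 0)


def chars_before_cd_offsets(data: bytes) -> bytes:
    """For each EOCDR in file order, take the byte just before its cd_offset."""
    out = b""
    pos = data.find(EOCD_SIG)
    while pos != -1:
        # The offset field sits 16 bytes into the record.
        cd_offset = struct.unpack_from("<I", data, pos + 16)[0]
        out += data[cd_offset - 1:cd_offset]
        pos = data.find(EOCD_SIG, pos + 4)
    return out


# Toy layout: one payload byte, then an 'X' standing in for a central
# directory header. 'a' is at index 0, 'b' at 2, 'c' at 4.
payload = b"aXbXcX"
blob = payload + make_eocd(5) + make_eocd(1) + make_eocd(3)
result = chars_before_cd_offsets(blob)
print(result)
```

The order of the EOCDRs, not the order of the payload bytes, determines the order of the recovered characters, which is the same idea the solver relies on.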

Flag: CTF{p0s7m0d3rn_z1p}

Below is the scanning script, which doubles as the solver.

code:py

# Scans dump.zip for ZIP records. Needs Python 3.8+ for the f"{x=}" syntax.
with open("dump.zip", "rb") as f:
    file = f.read()

i = 0  # current read position in file

def s(x):
    """Read x bytes at the current position and advance past them."""
    global i
    res = file[i:i + x]
    i += x
    return res

def b(x):
    """Step the read position back by x bytes (unused below)."""
    global i
    i -= x

offsets = []  # central directory offset from each EOCDR, in file order

while i < len(file):
    _i = i  # start offset of the current record
    header_signature = file[i:i + 4]
    if header_signature == b'PK\x03\x04':
        # local file header
        print("local file")
        s(4)
        print(f"{_i=}")
        version_needed_to_extract = s(2)
        general_purpose_bit_flag = s(2)
        compression_method = s(2)
        last_mod_file_time = s(2)
        last_mod_file_date = s(2)
        crc_32 = s(4)
        compressed_size = int.from_bytes(s(4), "little")
        uncompressed_size = int.from_bytes(s(4), "little")
        file_name_length = int.from_bytes(s(2), "little")
        extra_field_length = int.from_bytes(s(2), "little")
        file_name = s(file_name_length)
        extra_field = s(extra_field_length)
        compressed_data = s(compressed_size)
        print(f"{i=}")
        print()
    elif header_signature == b'PK\x01\x02':
        # central directory file header
        print("central file")
        s(4)
        print(f"{_i=}")
        version_made_by = s(2)
        version_needed_to_extract = s(2)
        general_purpose_bit_flag = s(2)
        compression_method = s(2)
        last_mod_file_time = s(2)
        last_mod_file_date = s(2)
        crc_32 = s(4)
        compressed_size = s(4)
        uncompressed_size = s(4)
        file_name_length = int.from_bytes(s(2), "little")
        extra_field_length = int.from_bytes(s(2), "little")
        file_comment_length = int.from_bytes(s(2), "little")
        disk_number_start = s(2)
        internal_file_attributes = s(2)
        external_file_attributes = s(4)
        relative_offset_of_local_header = int.from_bytes(s(4), "little")
        file_name = s(file_name_length)
        extra_field = s(extra_field_length)
        # file_comment is not skipped: its inflated length would
        # swallow the records that follow.
        print(f"{i=}")
        print()
    elif header_signature == b'PK\x05\x06':
        # End of central directory record
        print("End of central directory record")
        s(4)
        print(f"{_i=}")
        number_of_this_disk = s(2)
        number_of_the_disk_with_the_start_of_the_central_directory = s(2)
        total_number_of_entries_in_the_central_directory_on_this_disk = s(2)
        total_number_of_entries_in_the_central_directory = s(2)
        size_of_the_central_directory = int.from_bytes(s(4), "little")
        offset_of_start_of_central_directory_with_respect_to_the_starting_disk_number = int.from_bytes(s(4), "little")
        ZIP_file_comment_length = int.from_bytes(s(2), "little")
        # The ZIP file comment is not skipped either, for the same reason.
        offsets.append(offset_of_start_of_central_directory_with_respect_to_the_starting_disk_number)
        print(f"{i=}")
        print()
    else:
        # Not a known signature: advance one byte and try again.
        i += 1

flag = b""
for size_s in offsets:
    # -1 because the byte count is one-based while the index is zero-based;
    # in effect this reads the single byte just before each offset.
    flag += file[size_s - 1:size_s]
print(flag)

Aside

What is the "disk" that appears in the specification?

Apparently it refers to floppy disks.

Reference: What are "disks" in this context of the structure of ZIP files? - Stack Overflow